Sarcasm Detection in Chinese Using a Crowdsourced Corpus
Abstract
Based on the assumption that a comment expressing positive sentiment toward a negative issue is likely to be sarcastic, we propose a simple yet efficient method for collecting sarcastic text by crowdsourcing through social media, combined with a game-with-a-purpose approach. Using Facebook's reaction buttons, we collect posts that trigger strong negative emotion. Next, using PTT's search engine, we link PTT comments to the collected Facebook posts and build the sarcasm corpus. On this corpus, we compare the sarcasm detection performance of an SVM with naïve features against Convolutional Neural Network models. The results show a high accuracy rate and demonstrate the potential of the corpus.
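The collection heuristic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the reaction names, the 0.6 threshold, the `Post` structure, and the toy English lexicon are all assumptions made for illustration. The actual corpus is in Chinese, where a word segmenter would be needed instead of whitespace splitting.

```python
from dataclasses import dataclass, field

# Hypothetical reaction names and threshold; the abstract does not specify them.
NEGATIVE_REACTIONS = {"angry", "sad"}
NEGATIVE_SHARE_THRESHOLD = 0.6

# Toy positive-word lexicon, purely illustrative.
POSITIVE_WORDS = {"great", "wonderful", "bravo", "awesome"}


@dataclass
class Post:
    post_id: str
    reactions: dict  # reaction name -> count
    comments: list = field(default_factory=list)


def negative_share(post):
    """Fraction of a post's reactions that are negative (0.0 if it has none)."""
    total = sum(post.reactions.values())
    if total == 0:
        return 0.0
    negative = sum(
        count for name, count in post.reactions.items()
        if name in NEGATIVE_REACTIONS
    )
    return negative / total


def is_positive(comment):
    """Crude polarity check: does the comment contain any positive word?"""
    tokens = comment.lower().split()
    return any(tok.strip(".,!?") in POSITIVE_WORDS for tok in tokens)


def sarcasm_candidates(posts):
    """Return (post_id, comment) pairs: positive comments on negative posts."""
    candidates = []
    for post in posts:
        if negative_share(post) < NEGATIVE_SHARE_THRESHOLD:
            continue  # the post did not trigger strong negative emotion
        for comment in post.comments:
            if is_positive(comment):
                candidates.append((post.post_id, comment))
    return candidates


posts = [
    Post("p1", {"angry": 80, "like": 20},
         ["Great job, really wonderful!", "This is terrible."]),
    Post("p2", {"like": 90, "angry": 10}, ["Awesome news!"]),
]
print(sarcasm_candidates(posts))
# → [('p1', 'Great job, really wonderful!')]
```

Only the positive comment on the strongly negative post `p1` is flagged. Candidates produced this way would still need human validation, which is the role the abstract assigns to crowdsourcing.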
Similar resources
Irony and Sarcasm: Corpus Generation and Analysis Using Crowdsourcing
The ability to reliably identify sarcasm and irony in text can improve the performance of many Natural Language Processing (NLP) systems including summarization, sentiment analysis, etc. The existing sarcasm detection systems have focused on identifying sarcasm on a sentence level or for a specific phrase. However, often it is impossible to identify a sentence containing sarcasm without knowing...
A Large Self-Annotated Corpus for Sarcasm
We introduce the Self-Annotated Reddit Corpus (SARC), a large corpus for sarcasm research and for training and evaluating systems for sarcasm detection. The corpus has 1.3 million sarcastic statements — 10 times more than any previous dataset — and many times more instances of non-sarcastic statements, allowing for learning in both balanced and unbalanced label regimes. Each statement is furthe...
"Sure, I Did the Right Thing": A System for Sarcasm Detection in Speech
While a fair amount of work has been done on automatically detecting emotion in human speech, there has been little research on sarcasm detection. Although sarcastic speech acts are inherently subjective, humans have relatively clear intuitions as to what constitutes sarcastic speech. In this paper, we present a system for automatic sarcasm detection. Using a new acted speech corpus that is ann...
Putting Sarcasm Detection into Context: The Effects of Class Imbalance and Manual Labelling on Supervised Machine Classification of Twitter Conversations
Sarcasm can radically alter or invert a phrase’s meaning. Sarcasm detection can therefore help improve natural language processing (NLP) tasks. The majority of prior research has modeled sarcasm detection as classification, with two important limitations: 1. Balanced datasets, when sarcasm is actually rather rare. 2. Using Twitter users’ self-declarations in the form of hashtags to label data, ...
Sarcasm Detection on Czech and English Twitter
This paper presents a machine learning approach to sarcasm detection on Twitter in two languages – English and Czech. Although there has been some research in sarcasm detection in languages other than English (e.g., Dutch, Italian, and Brazilian Portuguese), our work is the first attempt at sarcasm detection in the Czech language. We created a large Czech Twitter corpus consisting of 7,000 manu...
Publication date: 2016